A Hierarchy with, of, and for Preposition Supersenses
نویسندگان
چکیده
English prepositions are extremely frequent and extraordinarily polysemous. In some usages they contribute information about spatial, temporal, or causal roles/relations; in other cases they are institutionalized, somewhat arbitrarily, as case markers licensed by a particular governing verb, verb class, or syntactic construction. To facilitate automatic disambiguation, we propose a general-purpose, broadcoverage taxonomy of preposition functions that we call supersenses: these are coarse and unlexicalized so as to be tractable for efficient manual annotation, yet capture crucial semantic distinctions. Our resource, including extensive documentation of the supersenses, many example sentences, and mappings to other lexical resources, will be publicly released. Prepositions are perhaps the most beguiling yet pervasive lexicosyntactic class in English. They are everywhere; their functional versatility is dizzying and largely idiosyncratic (1). They are nearly invisible, yet indispensable for situating the where, when, why, and how of events. In a way, prepositions are the bastard children of lexicon and grammar, rising to the occasion almost whenever a noun-noun or verbnoun relation is needed and neither subject nor object is appropriate. Consider the many uses of the word to, just a few of which are illustrated in (1):1 (1) a. My cake is to die for. b. If you want I can treat you to some. c. How about this: you go to the store d. to buy ingredients. e. Then if you give the recipe to me f. I’m happy to make the batter g. and put it in the oven for 30 to 40 minutes h. so you’ll arrive to the sweet smell of chocolate. i. That sounds good to me. j. That’s all there is to it. 1Though infinitival to is traditionally not considered a preposition, we allow it to be labeled with a supersense if the infinitival clause serves as a PURPOSE (as in (1d)) or FUNCTION. See §2. Sometimes a preposition specifies a relationship between two entities or quantities, as in (1g). In other scenarios it serves a case-marking sort of function, marking a complement or adjunct—principally to a verb (1b–1e, 1h, 1i), but also to an argument-taking noun or adjective (1f). Further, it is not always possible to separate the semantic contribution of the preposition from that of other words in the sentence. As amply demonstrated in the literature, prepositions play a key role in multiword expressions (Baldwin and Kim, 2010), as in (1a, 1b, 1j). An adequate descriptive annotation scheme for prepositions must deal with these messy facts. Following a brief discussion of existing approaches to preposition semantics (§1), this paper offers a new approach to characterizing their functions at a coarsegrained level. Our scheme is intended to apply to almost all preposition tokens, though some are excluded on the grounds that they belong to a larger multiword expression or are purely syntactic (§2). The rest of the paper is devoted to our coarse semantic categories, supersenses (§3).2 Many of these categories are based on previous proposals—primarily, Srikumar and Roth (2013a) (so-called preposition relations) and VerbNet (thematic roles; Bonial et al., 2011; Hwang, 2014, appendix C)—but we organize them into a hierarchy and motivate a number of new or altered categories that make the scheme more robust. Because prepositions are so frequent, so polysemous, and so crucial in establishing relations, we believe that a wide variety of NLP applications (including knowledge base construction, reasoning about events, summarization, paraphrasing, and translation) stand to benefit from automatic disambiguation of preposition supersenses. 2Supersense inventories have also been described for nouns and verbs (Ciaramita and Altun, 2006; Schneider et al., 2012; Schneider and Smith, 2015) and adjectives (Tsvetkov et al., 2014). Other inventories characterize semantic functions expressed via morphosyntax: e.g., tense/aspect (Reichart and Rappoport, 2010), definiteness (Bhatia et al., 2014, also hierarchical).
منابع مشابه
A corpus of preposition supersenses in English web reviews
We present the first corpus annotated with preposition supersenses, unlexicalized categories for semantic functions that can be marked by English prepositions (Schneider et al., 2015). That scheme improves upon its predecessors to better facilitate comprehensive manual annotation. Moreover, unlike the previous schemes, the preposition supersenses are organized hierarchically. Our data will be p...
متن کاملA Corpus of Preposition Supersenses
Function tags mapped to fewer than 20 supersense-tagged prepositions overall are not displayed. (This accounts for why the bars are not strictly decreasing in width.) Numbered arguments tagged with VPC are mapped to Direction in 8 instances. AM-LVB is mapped to Purpose in 8 instances, while AM-CXN is the dominant function tag mapped to Scalar/Rank (18 instances). Distribution of PropBank functi...
متن کاملThe Preposition Corpus in Sketch Engine
Corpora from the Pattern Dictionary of English Prepositions (PDEP) provide the basis for examining the behavior of 304 English prepositions, with 1040 senses (patterns) describing in 20 fields. The PDEP corpora comprise dependency parses for 81,509 sentences in CoNLL-X format, previously used in several studies, particularly used for disambiguation modeling. We have now put these parse data int...
متن کاملAutomatic identification of semantic relations in Italian complex nominals
This paper addresses the problem of the identification of the semantic relations in Italian complex nominals (CNs) of the type N+P+N. We exploit the fact that the semantic relation, which is underspecified in most cases, is partially made explicit by the preposition. We develop an annotation framework around five different semantic relations, which we use to create a corpus of 1700 Italian CNs,...
متن کاملCoverage And Inheritance In The Preposition Project
In The Preposition Project (TPP), 13 prepositions have now been analyzed and considerable data made available. These prepositions, among the most common words in English, contain 211 senses. By analyzing the coverage of these senses, it is shown that TPP provides potentially greater breadth and depth than other inventories of the range of semantic roles. Specific inheritance mechanisms are deve...
متن کامل